Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers
Similar resources
Improving Lip-reading with Feature Spac Audio-Visual Speech R
In this paper we investigate feature space transforms to improve lip-reading performance for multi-stream HMM-based audio-visual speech recognition (AVSR). The feature space transforms include a non-linear Gaussianization transform and feature space maximum likelihood linear regression (fMLLR). We apply Gaussianization at various stages of the visual front-end. The results show that Gaussianizing...
Improving visual features for lip-reading
Automatic speech recognition systems that utilise the visual modality of speech are often investigated within a speaker-dependent or a multi-speaker paradigm. That is, during training the recogniser will have had prior exposure to example speech from each of the possible test speakers. In a previous paper we highlighted the danger of not using different speakers in the training and test sets, an...
Finding phonemes: improving machine lip-reading
In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps where each has a different quantity of visemes ranging from two to 45. Viseme classes are based upon the mapping of articulated ...
Automatic lips reading for audio-visual speech processing and recognition
This contribution presents a method for automatic lip reading from video. The results of this automatic method are used for subsequent audio-visual speech processing and recognition. A simple image processing method for finding the human face in the video frame is presented here. The lips are found from the marked human face in the region of interest, where the lips are, wit...
Hearing lips: gamma-band activity during audiovisual speech perception.
Auditory pattern changes have been shown to elicit increases in magnetoencephalographic gamma-band activity (GBA) over left inferior frontal cortex, forming part of the putative auditory ventral "what" processing stream. The present study employed a McGurk-type paradigm to assess whether GBA would be associated with subjectively perceived changes even when auditory stimuli remain unchanged. Mag...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2020
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v34i04.6174